Proposed fix for issue #21519: Reshape layer does not handle -1 shape… #21568
base: master
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary of Changes
Hello @roebel, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
I've implemented a fix for the `Reshape` layer in Keras to correctly handle cases where the `target_shape` includes a `-1` placeholder and the input batch size is `None`. Previously, this scenario could lead to incorrect shape calculations. My changes ensure that the target shape is dynamically computed within the `call` method, rather than being statically resolved during `build`, addressing issue #21519 and improving the layer's robustness.
Highlights
- Reshape Layer Initialization: I've added validation to the `Reshape` layer's `__init__` method to ensure that the `target_shape` argument does not contain more than one `-1` value, preventing invalid configurations. I also introduced pre-calculated attributes (`need_explicit_shape_for_batch_size_None` and `new_size_no_minus_one`) to optimize dynamic shape computation; a sketch of this is shown after the list.
- Dynamic Shape Resolution: I've removed the `build` method's logic for resolving the target shape, as it was causing issues with dynamic batch sizes. Instead, the `call` method now dynamically computes the final `target_shape` when the input batch size is `None` and a `-1` is present in the `target_shape`, ensuring correct reshaping for partially defined input shapes.
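A minimal sketch of the validation and pre-calculation described in the first highlight. The two attribute names come from the summary above; the surrounding class, the error message, and the use of `math.prod` are assumptions, not the PR's actual diff.

```python
import math


class ReshapeInitSketch:
    """Sketch only: mirrors the `__init__` validation described above."""

    def __init__(self, target_shape):
        self.target_shape = tuple(target_shape)
        # Reject target shapes with more than one -1 placeholder.
        if self.target_shape.count(-1) > 1:
            raise ValueError(
                "target_shape may contain at most one -1, got "
                f"{self.target_shape}"
            )
        # True when a -1 must be resolved at call time because the batch
        # size is unknown (attribute name taken from the PR summary).
        self.need_explicit_shape_for_batch_size_None = -1 in self.target_shape
        # Product of the known target dimensions, later used to resolve
        # the -1 (attribute name taken from the PR summary).
        self.new_size_no_minus_one = math.prod(
            d for d in self.target_shape if d != -1
        )
```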
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request addresses an issue with the `Reshape` layer's handling of a `-1` dimension in the target shape, particularly when dealing with dynamic batch sizes. The fix correctly moves the shape resolution logic from the `build` method to the `call` method, enabling dynamic calculation when input shapes are fully or partially known at runtime. The implementation is logical, but I've pointed out a small piece of leftover code that should be cleaned up to improve maintainability.
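As a rough illustration of that build-to-call move, here is a hedged sketch of what call-time resolution can look like when the batch size is dynamic but every other input dimension is static. It reuses the attribute names from the PR summary above; everything else (including the surrounding method body) is an assumption, not the actual diff.

```python
import math

from keras import ops


def call(self, inputs):
    # Sketch only: resolve the -1 at call time when the batch size is dynamic.
    target_shape = self.target_shape
    if self.need_explicit_shape_for_batch_size_None:
        # All non-batch input dimensions are assumed to be static here, so
        # the -1 can be resolved from the number of elements per sample.
        elements_per_sample = math.prod(inputs.shape[1:])
        resolved = elements_per_sample // self.new_size_no_minus_one
        target_shape = tuple(resolved if d == -1 else d for d in target_shape)
    # The batch dimension is taken from the runtime shape of the inputs.
    return ops.reshape(inputs, (ops.shape(inputs)[0],) + target_shape)
```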
Codecov Report
❌ Patch coverage is
Additional details and impacted files

```
@@            Coverage Diff             @@
##           master   #21568      +/-   ##
==========================================
- Coverage   82.75%   77.19%    -5.56%
==========================================
  Files         567      567
  Lines       56468    56480       +12
  Branches     8818     8821        +3
==========================================
- Hits        46730    43601     -3129
- Misses       7577    10771     +3194
+ Partials     2161     2108       -53
```

Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry.
…e -1 shape info dynamically.
…usively with the case that is not properly handled by ops.reshape. This is when the batch dimension does not have a static shape (shape == None) but all other dimensions have a static shape. In that case we can determine the static shape of the dimension containing -1 from all dimensions that are not the batch dimension.
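For concreteness, a tiny worked example of the resolution this commit message describes (the shapes are illustrative, not taken from the diff):

```python
# Illustrative shapes only: input (None, 2, 12) with target_shape (-1, 8).
# The batch dimension is unknown, but every other input dimension is static,
# so the -1 can be resolved per sample:
elements_per_sample = 2 * 12             # 24
resolved = elements_per_sample // 8      # 3 -> output shape (None, 3, 8)
```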
90a845f to a502050
Can you provide a unit test that fails before and passes after?
The `Reshape` layer can handle `-1` both with static and dynamic dimensions (and a dynamic batch size). Instead of redoing the work of calculating the dynamically resolved shape, it just lets `ops.reshape` handle it. No need to duplicate that work.
It does so by having `self._resolved_target_shape` contain a `-1` in it if it could not be statically resolved.
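A hedged sketch of the approach described in this comment: `build` resolves what it can and leaves a `-1` for anything still dynamic, and `call` simply delegates to `ops.reshape`. Only `self._resolved_target_shape` and `ops.reshape` are named above; the remaining details are an assumed outline, not Keras' exact source.

```python
from keras import ops


def build(self, input_shape):
    # Resolve as much of the target shape as possible from the static part
    # of the input shape; keep -1 where a dimension is still unknown.
    output_shape = self.compute_output_shape(input_shape)
    self._resolved_target_shape = tuple(
        -1 if d is None else d for d in output_shape[1:]
    )


def call(self, inputs):
    # ops.reshape resolves at most one -1 from the runtime shape of `inputs`,
    # so no extra work is needed here.
    return ops.reshape(
        inputs, (ops.shape(inputs)[0],) + self._resolved_target_shape
    )
```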
You will find the test case in the new commit. For context: when using convolutional models with time-series data (voice processing), the time dimension is hardly ever known beforehand. You can probably force a dynamic dimension here, but for convolutional layers this is not required, and for the reshape layer it is not needed either.
I may not fully understand this, but I thought the reshape layer has an advantage over the reshape operator because it knows that the batch dimension stays the same. In fact, I wondered why the test case `test_reshape_with_dynamic_batch_size_and_minus_one` produces shape `(None, 3, 8)` and not `(None, None, 8)`, as, for example, this does:
…in the test name.
I don't see any reason why the torch test fails. It runs fine here in my Keras development environment, and it also runs fine with the mini script I shared in the issue.
I have implemented a fix that avoids precalculating a statically resolved reshape target in the `build` method and instead computes the proper shape information in the `call` method for the problematic case where the shape information is incomplete. The fix appears to pass the reshape tests for all backends.
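The test added in the commit is not reproduced here; the following is only a sketch of the kind of check such a test might perform, with shapes chosen to match the `(None, 3, 8)` result mentioned earlier in this thread.

```python
import numpy as np

import keras
from keras import layers


def test_reshape_with_dynamic_batch_size_and_minus_one_sketch():
    # Symbolic input: unknown batch size, static remaining dimensions.
    inputs = keras.Input(shape=(2, 12))
    outputs = layers.Reshape((-1, 8))(inputs)
    # 2 * 12 == 24 elements per sample, so the -1 resolves to 24 // 8 == 3.
    assert tuple(outputs.shape) == (None, 3, 8)

    # The model should also run on concrete data with an arbitrary batch size.
    model = keras.Model(inputs, outputs)
    preds = model.predict(np.zeros((5, 2, 12)), verbose=0)
    assert preds.shape == (5, 3, 8)
```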